Loading and cleaning data

This file contains time-series plots from the SoS database.

First, we load the SoS database and process the country columns. This uses a function found in the column_cleaning_workflows.R file, which matches each list of countries for each surveillance system to a set of countries.

# knitr::opts_chunk$set(message = FALSE)
library(plyr)
library(dplyr)
library(magrittr)
library(lubridate)
library(ggplot2)
library(devtools)

load_all()

data(sos_raw)

import_country_codes()

sosid <- paste0("SOS", 1:nrow(sos_raw))

countries <- clean_countries(sos_raw)

This gives us a data frame in which each column is a country, and each row is a surveillance system—it's a true-false matrix of surveillance system by country.

Counts of Surveillance Systems by Country.

We can create a table showing the number of surveillance systems in each country.

by_country <- countries %>%
  summarise_each(funs(n = sum(as.numeric(.)))) %>%
  t() %>% # Transpose it so that countries are rows.
  as.data.frame() %>%
  mutate(iso3 = rownames(.)) %>%
  arrange(desc(V1))
names(by_country)[1] <- "n_systems"
knitr::kable(head(by_country, 20))

We'll also integrate country population for later.

library(curl)
pop_csv <- curl_download(url = "https://raw.githubusercontent.com/datasets/population/master/data/population.csv",
                         destfile = tempfile())
population <- read.csv(pop_csv, stringsAsFactors = FALSE)
population %<>%
  filter(Year == 2014) %>%
  select(iso3 = Country.Code, population = Value) 
by_country %<>%
  left_join(population) %>%
  mutate(systems_per_capita = n_systems / population)

by_country %>%
  select(iso3, systems_per_capita) %>%
  arrange(desc(systems_per_capita)) %>%
  head(20) %>%
  knitr::kable()

Plot data on map

The plots we're doing to do, in order, are:

  1. Number of surveillance systems
  2. Country population
  3. Surveillance systems per capita

Plots 4, 5, and 6 are the same three plots with color palette scaling log-transformed.

library(rworldmap)
sos_map <- joinCountryData2Map(dF = by_country, joinCode = "ISO3", nameJoinColumn = "iso3")
spplot(sos_map, zcol = "n_systems")
spplot(sos_map, zcol = "population")
spplot(sos_map, zcol = "systems_per_capita")

sos_map$log_n_systems <- log(sos_map$n_systems)
spplot(sos_map, zcol = "log_n_systems")
sos_map$log_population <- log(sos_map$population)
spplot(sos_map, zcol = "log_population")
sos_map$log_systems_per_capita <- log(sos_map$systems_per_capita)
spplot(sos_map, zcol = "log_systems_per_capita")


ecohealthalliance/sos documentation built on May 15, 2019, 7:56 p.m.